SUMMARY


This paper summarizes the approach and findings of research into reliability,
supportability, and survivability prediction techniques for fault-tolerant
avionics systems. Since no technique existed to analyze the fault tolerance
of reconfigurable systems, a new method was developed and implemented in the
Mission REliability Model (MIREM). The supportability analysis was completed
by using the Simulation of Operational Availability/Readiness (SOAR) model.
Both the Computation of Vulnerable Area and Repair Time (COVART) model and
FASTGEN, a survivability model, proved valuable for the survivability
research. Sample results are presented, and several recommendations are
given for each of the three areas investigated under this study.


PREFACE 


This paper documents research into reliability,
supportability, and survivability prediction
techniques for fault-tolerant avionics and their
application to Integrated Communication, Navigation,
and Identification Avionics. This work is jointly
supported by the Air Force Human Resources Laboratory
and the Air Force Wright Aeronautical Laboratories.
The guidance and support of Messrs. James C.
McManus, Daniel V. Ferens, and Robert L. Harris of
these organizations are greatly appreciated.
I. INTRODUCTION


The Impact Analysis of Integrated Communication,
Navigation, and Identification Avionics (ICNIA) program, an
overview of which is depicted in Figure 1, had the
following goals:

1. Develop logistics analysis methods that are
appropriate for design evaluation of integrated,
fault-tolerant systems early in the development cycle.

2.  Investigate  traditional  and  innovative 
maintenance  concepts;  in  particular,  evaluate  deferred 
repair  policies  that  would  exploit  fault  tolerance  to 
increase  sustainability  in  limited  repair  environments. 

3. Apply these techniques to the two ICNIA
architectures under development.

4. Influence the ICNIA designs to improve
reliability and supportability.

5. Document the research and development
results in a form suitable for use by design engineers.

One primary motivation for this research was that,
historically, logistics engineering disciplines have been
applied to new avionics designs in the later stages of
development. To ensure that avionics designs are reliable,
supportable, and survivable in the operating environment,
logistics engineering techniques are needed that can be
effectively implemented early in the development cycle.
Use of these techniques will allow design engineers to
provide for reliability, supportability, and survivability
before the design is fixed.

Recent trends toward integration and fault tolerance
in avionics also create a need for new techniques
that capture these characteristics and can identify
support concepts that exploit the fault-tolerant nature of
the systems. In the ICNIA system, fault tolerance will be
achieved through dynamic reconfiguration that allocates
common system resources to a variety of radio functions
across a wide spectrum of frequencies. Dynamic
reconfiguration will allow faults to be managed and resources
to be effectively shared between required functions. Because
of its intended role as the sole communication set onboard
tactical aircraft, system reliability and fault tolerance
are of central importance to ICNIA.


II. METHODOLOGY


Reliability  Analysis 

A survey of existing techniques led to the conclusion
that no single model in use by DoD both considers
the complex relationships between system components found
in ICNIA and provides simple enough data input structures
to deal with a realistic number of system resources. As a
result, a new method for analyzing fault tolerance in
reconfigurable systems was developed (Figure 2). Implemented
in the Mission REliability Model (MIREM) computer program,
this method analyzes a network structure of functional
components. The Mission Completion Success Probability
(MCSP) is computed for a specified mission requiring
certain Communication, Navigation, and Identification (CNI)
functions. Measures such as Mean Time Between Critical
Failure (MTBCF) and failure resiliency (a measure of fault
tolerance) are also generated.
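
As a rough illustration of the two main quantities named above, the
sketch below computes MCSP and MTBCF for a single pool of identical,
independent resources, of which a mission needs at least `required`
working. This is not the MIREM network algorithm. The pool model, the
constant failure rate, the function names, and the numbers are all
assumptions chosen for illustration.

```python
from math import comb, exp


def pool_reliability(n, required, mtbf_hours, t_hours):
    """Probability that at least `required` of `n` units survive to t_hours.

    Assumes independent units with exponential lifetimes (constant
    failure rate) and no repair during the mission.
    """
    if not 1 <= required <= n:
        raise ValueError("need 1 <= required <= n")
    p = exp(-t_hours / mtbf_hours)  # survival probability of one unit
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(required, n + 1))


def pool_mtbcf(n, required, mtbf_hours):
    """Mean time to the first critical failure of the pool, without repair.

    With k exponential units working, the mean wait for the next failure
    is mtbf_hours / k; the pool stays mission-capable while k >= required.
    """
    if not 1 <= required <= n:
        raise ValueError("need 1 <= required <= n")
    return sum(mtbf_hours / k for k in range(required, n + 1))


# Hypothetical pool: 3 units, 1000-hour unit MTBF, 3-hour mission.
mcsp_all_needed = pool_reliability(3, 3, 1000.0, 3.0)  # no spare unit
mcsp_one_spare = pool_reliability(3, 2, 1000.0, 3.0)   # one unit may fail
mtbcf_all_needed = pool_mtbcf(3, 3, 1000.0)
mtbcf_one_spare = pool_mtbcf(3, 2, 1000.0)
```

In this toy case, relaxing the requirement from three units to two
raises both MCSP and MTBCF, which is the same qualitative effect that
the mission definition has on the results reported for ICNIA.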

Results for a hypothetical architecture and two
mission scenarios are shown in Table 1. Mission scenario 1
requires all the available resources in several parts of
the system, resulting in very little fault tolerance.
Relative to mission scenario 2, however, the system
contains redundant resources, which extend the time between
critical failures (those that cause mission failure) without
repair from the Mean Time Between Failure (MTBF) of 224
hours to 1379 hours. Expressing this in terms of failure
resiliency, roughly six failures can be sustained before a
critical failure occurs. These representative results show
that fault tolerance can dramatically prolong the operating
time without repair of the system [...] redundancy relative
to the mission requirements.
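
The "roughly six failures" figure can be checked against the hours
quoted above, under the assumption (not stated explicitly in the text)
that failure resiliency is the ratio of MTBCF to MTBF, that is, the
average number of failures sustained per critical failure. The function
name is illustrative only.

```python
def failure_resiliency(mtbcf_hours, mtbf_hours):
    """Average failures per critical failure, assumed to be MTBCF / MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return mtbcf_hours / mtbf_hours


# Hours quoted in the text for the scenario with redundant resources.
resiliency = failure_resiliency(1379.0, 224.0)
```

Under this assumption the ratio is about 6.16, in line with the
statement that roughly six failures can be sustained before a critical
failure occurs.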


Logistics  Support  Analysis 

A logistics support analysis technique was sought
that would relate design factors and support resources to
readiness without imposing data requirements that are
unrealistic during the early design phase. After a literature
review, the Simulation of Operational Availability/Readiness
(SOAR*) model was selected for this purpose and modified to
address fault tolerance. SOAR is a deterministic mean value
simulation of the dynamics of aircraft sortie and
maintenance operations at a single site. It considers the
logistics parameters shown in Figure 3.

*SOAR is a trademark of The Analytic Sciences Corporation.
Figure 2. Mission REliability Model (MIREM) Overview.

TABLE 1. SAMPLE RELIABILITY RESULTS

                       MISSION SCENARIO 1    MISSION SCENARIO 2
QUANTITY               THREE FUNCTIONS       TWO FUNCTIONS
                       REQUIRED              REQUIRED
                       SIMULTANEOUSLY        SIMULTANEOUSLY

MCSP (3 hr MISSION)    [illegible]           0.999996
MTBCF                  249 hrs               1379 hrs
MTBF                   224 hrs               224 hrs
FAILURE RESILIENCY     1.11                  6.15


Figure 3. Readiness Methodology Overview. (Operational
outputs shown: mission availability; sorties/aircraft/day.)


The effect of deferring repair of noncritical
failures until a critical failure occurs, in order to
generate more sorties without maintenance downtime, can be
explored using the model. Figure 4 quantifies the effect
of deferred repair on the maximum sortie rate that can be
supported by the hypothetical fault-tolerant ICNIA
architecture discussed above. Under two-level support and the
same spares levels, the deferred repair policy can sustain
a high sortie rate without maintenance actions much longer
than can the immediate repair policy, which quickly
depletes the available spares.
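
A back-of-the-envelope comparison of the two policies, using the hours
quoted earlier for the hypothetical architecture, is sketched below. It
is not the SOAR simulation: it ignores spares, repair pipelines, and
downtime, and it assumes that a sortie lasts as long as the 3-hour
mission in Table 1. It only estimates how many sorties are flown, on
average, between maintenance actions when repair is triggered by any
failure (immediate) or only by a critical failure (deferred). The
function and constant names are illustrative.

```python
def maintenance_free_sorties(mean_hours_between_actions, sortie_hours):
    """Mean number of sorties flown between maintenance actions."""
    if sortie_hours <= 0:
        raise ValueError("sortie_hours must be positive")
    return mean_hours_between_actions / sortie_hours


SORTIE_HOURS = 3.0    # assumption: sortie length equals the 3-hour mission
MTBF_HOURS = 224.0    # immediate repair: any failure triggers an action
MTBCF_HOURS = 1379.0  # deferred repair: only a critical failure does

immediate = maintenance_free_sorties(MTBF_HOURS, SORTIE_HOURS)
deferred = maintenance_free_sorties(MTBCF_HOURS, SORTIE_HOURS)
```

With these inputs the immediate policy averages about 75 sorties
between maintenance actions and the deferred policy about 460. The
spares depletion effect described above requires a model such as SOAR
and is not captured by this estimate.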


Figure 4. Maximum Sortie Generation by Repair Policy.


Survivability Analysis

The role of ICNIA as the sole communication set
onboard an aircraft could lead to the conjecture that ICNIA
is more vulnerable to projectile threats than a suite of
discrete CNI avionics would be. On the other hand, volume
reduction through system integration makes ICNIA a
smaller target. The vulnerability question also includes
consideration of which Line Replaceable Units (LRUs) ICNIA
can afford to lose and still retain a specified capability.
A front-end study disclosed that the FASTGEN and COVART
computer programs are widely accepted for survivability
analysis. However, a baseline CNI survivability analysis
was not available because CNI is generally considered
negligible when assessing aircraft vulnerability. A
methodology for addressing ICNIA survivability was developed
(Figure 5), extracting relevant portions of the FASTGEN
and COVART programs.

Preliminary penetration assessment for typical LRU
configurations in an avionics bay and representative
projectile threats revealed that penetration through two LRUs
was likely. Since ICNIA houses redundant components in at
most two LRUs, these results indicate that CNI system kill
will be highly dependent on explicit protection concepts
and individual LRU placement. The key to survivability
improvement of integrated systems over discrete systems lies
in volume and weight reduction, which can be transformed
into various protective measures.
III. CONCLUSIONS AND RECOMMENDATIONS


The methodologies for assessing reliability and
logistics support impacts, presented herein, have been
applied to the ICNIA architectures during the early stages of
design. They are capable of analyzing the impacts of
integration and fault tolerance in complex systems. Several
general conclusions can be drawn from the ICNIA
architectures analyzed. For reliability,

1. Single components that can cause system
failures (critical failures), if they exist, are the
single most important factor in MCSP.

2.  A  second  level  of  redundancy  (at  the  LRU 
level)  improves  reliability  only  if  all  critical  functions 
are  supported  on  both  of  the  LRUs. 

3.  The  determination  of  which  functions  are 
critical  for  a  mission  and  whether  they  are  required 
simultaneously  can  drastically  affect  MCSP. 

4. Reconfigurability (e.g., inter-LRU connections)
between components that are already redundant does
not necessarily enhance reliability.
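
The first two reliability conclusions can be illustrated with a small
enumeration, sketched below under simplifying assumptions: LRUs fail
independently, each working LRU can host every function it supports at
once, and the mission succeeds when every required function is hosted
by at least one working LRU. The LRU names, function names, and
survival probabilities are hypothetical, and the code is not taken from
MIREM.

```python
from itertools import product


def mission_success_probability(lru_survival, support, required):
    """Sum the probability of every LRU up/down state that meets the mission.

    lru_survival: dict of LRU name -> probability it survives the mission
    support:      dict of LRU name -> set of functions it can host
    required:     set of functions the mission needs
    """
    names = list(lru_survival)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        prob = 1.0
        available = set()
        for name, up in zip(names, states):
            prob *= lru_survival[name] if up else 1.0 - lru_survival[name]
            if up:
                available |= support[name]
        if required <= available:  # every required function is hosted
            total += prob
    return total


survival = {"A": 0.99, "B": 0.99}
required = {"comm", "nav"}

# "nav" lives only on LRU A, so A is a single point of failure.
partial = mission_success_probability(
    survival, {"A": {"comm", "nav"}, "B": {"comm"}}, required)

# Both critical functions are supported on both LRUs.
full = mission_success_probability(
    survival, {"A": {"comm", "nav"}, "B": {"comm", "nav"}}, required)
```

In the first case the result equals the survival probability of LRU A
alone (0.99), so the second LRU adds nothing; in the second case it
rises to 0.9999.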

In  terms  of  supportability, 

1. Deferral of repair until a critical failure
occurs allows a high sortie rate to be sustained for a
longer period without repair. The payoff is substantial
for highly fault-tolerant systems, particularly under a
two-level maintenance policy. However, some penalty is
paid in MCSP for flying systems that contain failed
components (less redundancy).

2. High reliability, deferred repair policies,
and increased modularity all provide impetus to use
two-level rather than three-level maintenance, thereby
eliminating expensive intermediate-level test equipment.

The  developed  techniques  have  the  advantage  of 
not  requiring  highly  detailed  design  and  logistics  inputs 
and  of  being  relatively  streamlined.  The  computerized 
models  are  amenable  to  interactive  use  and  could  be  hosted 
on  a  minicomputer.  As  a  result,  the  techniques  can  be 
applied  early  in  the  design  phase  as  a  design  tool  to  aid 


In the survivability area, the reduced volume and
weight of ICNIA, compared to discrete systems, provide the
key to decreased vulnerability, since the savings can be
transformed into increased protective measures. Preliminary
results suggest that ICNIA system kill will be dependent on
explicit protection concepts and LRU placement; therefore,
detailed analysis appears more fitting when actual
installation nears.


Several areas of additional research are suggested
by this study. The reliability model developed here does
not include the effects of incomplete or faulty built-in
test coverage, which could cause incorrect switching by
the system controller. For highly fault-tolerant systems,
the effect of incorrect switching is likely to be
significant. Software reliability and fault tolerance, which
will become increasingly important in these systems, are
also areas for which further research would prove useful.
Maintenance concepts that rely on smart systems to schedule
and reduce the number of repair actions pose another major
issue. The implications of attempting to institutionalize
such a concept need to be explored. Finally, enhancement
and possibly integration of the models developed here into
an interactive, user-friendly package are recommended in
order to provide the capability to those individuals who
most influence early system design: the design engineers.